A tunable collective communication framework on a cluster of SMPs
Authors
Abstract
In this paper we investigate a tunable MPI collective communications library on a cluster of SMPs. Most tunable collective communications libraries select optimal algorithms for inter-node communication on a given platform. We add another layer of intra-node communication composed of several tunable shared-memory operations. We explore the advantages of our approach and discuss when to use it and when to switch to another approach on the shared-memory layer. Experimental results indicate that collective communications designed with this approach, when properly tuned, can outperform vendor implementations.
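The two-layer idea in the abstract can be illustrated with a minimal sketch. It is not the authors' library: the tuning table, its size thresholds, the intra-node algorithm names, and the helpers `select` and `plan_broadcast` are all hypothetical, and the code only plans a broadcast rather than performing any MPI communication. It assumes one leader rank per node, with leaders exchanging data over the inter-node layer before each node distributes it locally over the shared-memory layer.

```python
# Hypothetical tuning table: (upper message-size bound in bytes, algorithm name)
# for each layer. Real entries would come from benchmarking the target platform.
TUNING = {
    "inter": [(8192, "binomial_tree"), (float("inf"), "scatter_allgather")],
    "intra": [(4096, "shared_buffer_copy"), (float("inf"), "pipelined_shared_buffer")],
}


def select(layer, msg_bytes, table=TUNING):
    """Return the first algorithm in the layer whose size bound covers msg_bytes."""
    for bound, name in table[layer]:
        if msg_bytes <= bound:
            return name
    raise ValueError(f"no tuning entry for {layer} at {msg_bytes} bytes")


def plan_broadcast(node_of_rank, root, msg_bytes, table=TUNING):
    """Plan a two-level broadcast as a list of (layer, algorithm, ranks) steps.

    node_of_rank[i] is the node id hosting rank i (an assumed input format).
    """
    nodes = {}
    for rank, node in enumerate(node_of_rank):
        nodes.setdefault(node, []).append(rank)

    # The root leads its own node; on every other node the lowest rank leads.
    root_node = node_of_rank[root]
    leaders = {n: (root if n == root_node else ranks[0]) for n, ranks in nodes.items()}

    plan = []
    if len(nodes) > 1:
        # Inter-node phase: only the node leaders take part.
        plan.append(("inter", select("inter", msg_bytes, table), sorted(leaders.values())))
    for node, ranks in sorted(nodes.items()):
        if len(ranks) > 1:
            # Intra-node phase: each leader shares the data with its local ranks.
            plan.append(("intra", select("intra", msg_bytes, table), ranks))
    return plan


if __name__ == "__main__":
    # Four ranks on two dual-processor nodes, broadcasting 1 KiB from rank 0.
    for step in plan_broadcast([0, 0, 1, 1], root=0, msg_bytes=1024):
        print(step)
```

Keeping a separate table per layer is what makes each layer independently tunable in this sketch: changing the shared-memory thresholds does not affect which inter-node algorithm is chosen.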
Similar resources
A planning framework for post-disaster collective shelters
Background and objective: Accommodation of the victims is necessary immediately after the accident to maintain order and reduce mental, psychological and physical crises, as well as providing better services to the victims of the accident. The experience of the AQ-Qala flood and the temporary accommodation of the victims in sports venues has made the need for the design of a collective shelter ...
SCORE on Millennium Cluster: A data-flow programming environment for a cluster of SMPs
This work presents SCORE for Millennium, a data-flow programming environment adapted for a cluster of SMPs. Programs expressed as data-flow graphs expose inter-operator parallelism and dependencies, which permit a runtime system to analyze and improve their performance automatically, without human intervention. Here we describe the implementation of the infrastructure to support such a programm...
Technische Universität Chemnitz Sonderforschungsbereich 393 Numerische Simulation auf massiv parallelen Rechnern
The characteristics of irregular algorithms make a parallel implementation difficult, especially for PC clusters or clusters of SMPs. These characteristics may include an unpredictable access behavior to dynamically changing data structures or strong irregular coupling of computations. Problems are an unknown load distribution and expensive irregular communication patterns for data accesses and...
Sorting on Clusters of SMPs
We introduce an efficient algorithm for sorting on clusters of symmetric multiprocessors (SMPs). This algorithm relies on a novel scheme for stably sorting on a single SMP coupled with balanced regular communication on the cluster. Our SMP algorithm seems to be asymptotically faster than any of the published algorithms. The algorithms were implemented in C using POSIX threads and the SIMPLE lib...
Orthrus: A Framework for Implementing Efficient Collective I/O in Multi-core Clusters
Optimization of access patterns using collective I/O imposes the overhead of exchanging data between processes. In a multi-core-based cluster the costs of inter-node and intra-node data communication are vastly different, and heterogeneity in the efficiency of data exchange poses both a challenge and an opportunity for implementing efficient collective I/O. The opportunity is to effectively exp...